UAMCLyR at RepLab 2014: Author Profiling Task

Authors

  • Esaú Villatoro-Tello
  • Gabriela Ramírez-de-la-Rosa
  • Christian Sánchez-Sánchez
  • Héctor Jiménez-Salazar
  • Wulfrano Arturo Luna-Ramírez
  • Carlos Rodríguez-Lucatero
Abstract

This paper describes the participation of the Language and Reasoning Group of UAM in the RepLab 2014 Author Profiling evaluation lab. The task comprises two subtasks: author categorization and author ranking. For author categorization, we use a supervised approach built on the information in each user's Twitter profile; an attribute selection technique then extracts the attributes most representative of each user's activity domain. For author ranking, we use a two-step chained method based on stylistic attributes (e.g., lexical richness, language complexity) and behavioral attributes (e.g., posting frequency, directed tweets) extracted from the users' profiles and posts. We use these attributes together with a Markov Random Field to improve an initial ranking given by the confidence of a Support Vector Machine classifier. The results obtained are encouraging and motivate us to keep working on the same ideas.
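The stylistic and behavioral attributes named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything below is an assumption made for illustration: lexical richness is approximated by the type-token ratio, a directed tweet is taken to be one that starts with "@", and posting frequency is posts per day over the observed time span. The function names (`tokenize`, `lexical_richness`, `directed_ratio`, `posts_per_day`) are hypothetical and are not taken from the paper.

```python
import re
from datetime import datetime


def tokenize(text):
    """Return lowercase word tokens after dropping URLs, mentions and hashtags."""
    cleaned = re.sub(r"https?://\S+|[@#]\w+", " ", text.lower())
    # [^\W\d_]+ matches runs of letters, including accented ones.
    return re.findall(r"[^\W\d_]+", cleaned)


def lexical_richness(tweets):
    """Type-token ratio over all tweets (assumed proxy for lexical richness)."""
    tokens = [tok for tweet in tweets for tok in tokenize(tweet)]
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def directed_ratio(tweets):
    """Fraction of tweets that start with a mention (assumed 'directed tweet')."""
    if not tweets:
        return 0.0
    directed = sum(1 for tweet in tweets if tweet.lstrip().startswith("@"))
    return directed / len(tweets)


def posts_per_day(timestamps):
    """Posts per day over the observed span; the span is floored at one day."""
    if len(timestamps) < 2:
        return float(len(timestamps))
    span_days = (max(timestamps) - min(timestamps)).total_seconds() / 86400
    return len(timestamps) / max(span_days, 1.0)


def profile_features(tweets, timestamps):
    """Collect the three illustrative attributes into one feature dictionary."""
    return {
        "lexical_richness": lexical_richness(tweets),
        "directed_ratio": directed_ratio(tweets),
        "posts_per_day": posts_per_day(timestamps),
    }
```

A feature dictionary of this kind could serve as input to a classifier whose confidence scores give an initial ranking. The Markov Random Field re-ranking step described in the abstract is not sketched here, since the abstract gives no detail on how it is set up.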

Similar resources

UNED at CLEF RepLab: Author Profiling

This paper describes a learning system developed at UNED for the RepLab 2014 author profiling task. The system uses a voting model that employs a small set of features based mainly on tweet text information, such as POS tags, number of hashtags, or number of links. In the unofficial run, the feature set was extended with Twitter metadata such as number of followers or retweet speed. The sy...

UAMCLyR at RepLab 2013: Profiling Task

This paper describes the participation of the Language and Reasoning Group of UAM in the RepLab 2013 Profiling evaluation lab. We adopted Distributional Term Representations (DTR) to address the following problems: i) filtering tweets that are related to an entity, and ii) identifying positive or negative implications for the entity’s reputation, i.e., polarity for reputation. Distributional Term R...

UAMCLyR at Replab2013: Monitoring Task

In this article we address the Topic Detection and Priority Detection subtasks of RepLab 2013, applying clustering and classification methods as well as term selection techniques in order to assess their performance on two subcollections of tweets: single and extended (a single tweet plus derived tweets). Our tests show good performance even though we used very few resources.

University of Tehran at RepLab 2014

In this paper, we present our approach to the author ranking subtask, which is part of the author profiling task in RepLab 2014. In this subtask, systems are expected to detect influential authors and opinion makers on Twitter. The systems’ output, for a given domain, must be a list of authors ranked by their probability of being an influential author or opinion maker. Our system ut...

Overview of the Author Profiling Task at PAN 2014

This overview presents the framework and the results for the Author Profiling task at PAN 2014. This year's objective is to analyze how well the detection approaches adapt to different genres. For this purpose a corpus with four different parts (subcorpora) has been compiled: social media, Twitter, blogs, and hotel reviews. The construction of the Twitter subcorpus happened i...

Publication date: 2014